
[Inference Client] Factorize inference payload build #2601

Merged — 14 commits merged into main on Oct 15, 2024

Conversation

@hanouticelina (Contributor) commented:
This PR is a first attempt at factorizing the payload build across multiple InferenceClient methods. Several methods share repetitive logic for handling inputs and parameters, so this PR introduces a new (private) helper function to factor out that logic.

Key changes

  1. Adding a helper function (see the sketch after this section) that:
    • handles both raw content (images or audio) and string/dict inputs uniformly.
    • base64-encodes raw content when at least one parameter is present.
    • filters out None values from the parameters.
    • …and returns an _InferenceInputs object containing the JSON payload and the raw data, if any. (I don't have a strong opinion on this; we could also return a Tuple instead.)
  2. A unit test to verify the correct behavior of the refactored logic.

⚠️ These changes only affect the internal/private functionality of the InferenceClient and AsyncInferenceClient.
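
As a rough illustration of the behavior described above, here is a minimal sketch of such a helper. The names _prepare_payload and _InferenceInputs come up in this review, but the exact signature, fields, and payload shape below are assumptions, not the merged code:

```python
import base64
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Optional, Union


@dataclass
class _InferenceInputs:
    # Hypothetical container: exactly one of these is expected to be set.
    json: Optional[Dict[str, Any]] = None
    raw_data: Optional[Union[bytes, Path, str]] = None


def _prepare_payload(
    inputs: Union[str, Dict[str, Any], bytes, Path],
    parameters: Optional[Dict[str, Any]],
) -> _InferenceInputs:
    # Filter out None values from the parameters.
    parameters = {k: v for k, v in (parameters or {}).items() if v is not None}
    if isinstance(inputs, (bytes, Path)):
        if not parameters:
            # No parameters: send the raw content as the request body, untouched.
            return _InferenceInputs(raw_data=inputs)
        # At least one parameter: base64-encode the raw content into a JSON payload.
        data = inputs if isinstance(inputs, bytes) else inputs.read_bytes()
        encoded = base64.b64encode(data).decode()
        return _InferenceInputs(json={"inputs": encoded, "parameters": parameters})
    # String or dict inputs go into the JSON payload directly.
    payload: Dict[str, Any] = {"inputs": inputs}
    if parameters:
        payload["parameters"] = parameters
    return _InferenceInputs(json=payload)
```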

@hanouticelina hanouticelina requested a review from Wauplin October 11, 2024 13:35
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Wauplin (Contributor) left a comment:

Thanks @hanouticelina! I'm sorry, I realized too late that I commented on both _client.py and _async_client.py, but all comments apply to both (since the async client is autogenerated). My main concern is about determining whether a task expects binaries as input or not (see below). Let me know if you have other ideas on how to fix it. I'm only half-happy with the suggested expect_binary: bool solution 😄

Resolved review threads: src/huggingface_hub/inference/_generated/_async_client.py (×2), tests/test_inference_client.py

```python
def is_raw_content(inputs: Union[str, Dict[str, Any], ContentT]) -> bool:
    return isinstance(inputs, (bytes, Path)) or (
        isinstance(inputs, str) and inputs.startswith(("http://", "https://"))
    )
```
@Wauplin (Contributor), commenting on the snippet above:

This is an annoying part 😕 Depending on the context, inputs.startswith(("http://", "https://")) should lead to different behavior:

  • in image_to_text, a URL as input must be passed as post(data=...) so that the URL is loaded and its content sent to the inference server
  • in feature_extraction, a URL as input should be passed as post(payload={"inputs": ...}) => a URL is a special case of string input, but still a valid one

Since _prepare_payload is agnostic of the task, it can't know which case we're in.
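
To make the divergence concrete, here is a toy (purely illustrative) dispatcher showing how the same URL string must be routed differently per task; the post(data=...) / post(payload=...) convention follows this discussion, not a real API:

```python
url = "https://example.com/cat.png"

def route(task: str, inputs: str) -> dict:
    """Toy dispatcher: same URL input, different transport per task."""
    if task == "image-to-text":
        # The URL locates the binary payload: it must be downloaded and its
        # bytes sent as the request body, i.e. post(data=...).
        return {"data": f"<bytes downloaded from {inputs}>"}
    # For feature-extraction and similar text tasks, a URL is just a valid
    # string input and belongs inside the JSON payload, i.e. post(payload=...).
    return {"payload": {"inputs": inputs}}

print(route("image-to-text", url))       # sent as raw request body
print(route("feature-extraction", url))  # sent as {"inputs": url}
```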

@Wauplin (Contributor):

What do you think of modifying the method signature to

```python
def _prepare_payload(
    inputs: Union[str, Dict[str, Any], ContentT],
    parameters: Optional[Dict[str, Any]],
    expect_binary: bool,
)
```

?

For tasks that expect a binary input (image_to_*, audio_to_*), you would pass _prepare_payload(..., expect_binary=True).

This way you could have logic like this:

```python
is_binary = isinstance(inputs, (bytes, Path))

if expect_binary and not is_binary and not isinstance(inputs, str):
    raise ValueError(...)  # should be a binary or at least a string (local path or url)

if expect_binary and not has_parameter:
    return _InferenceInputs(raw_data=inputs)

if not expect_binary and is_binary:
    raise ValueError(...)  # cannot be a binary

# else set as "inputs" in a json payload
...
```
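
Filling in the ellipses, a runnable rendering of that sketch might look like the following (reusing the imports and hypothetical _InferenceInputs from the earlier sketch; the error messages and the final branch are assumptions):

```python
def _prepare_payload(
    inputs: Union[str, Dict[str, Any], bytes, Path],
    parameters: Optional[Dict[str, Any]],
    expect_binary: bool = False,
) -> _InferenceInputs:
    parameters = {k: v for k, v in (parameters or {}).items() if v is not None}
    is_binary = isinstance(inputs, (bytes, Path))
    if expect_binary and not is_binary and not isinstance(inputs, str):
        # Binary tasks accept raw bytes, a local path, or at least a string (path or URL).
        raise ValueError(f"Expected bytes, Path, or str input, got {type(inputs)}.")
    if expect_binary and not parameters:
        # No parameters: forward the raw content (or path/URL string) as-is.
        return _InferenceInputs(raw_data=inputs)
    if not expect_binary and is_binary:
        raise ValueError(f"Unexpected binary input for this task: {type(inputs)}.")
    # Else set as "inputs" in a JSON payload, base64-encoding in-memory binaries.
    # (A fuller implementation would also fetch path/URL strings before encoding.)
    if is_binary:
        data = inputs if isinstance(inputs, bytes) else inputs.read_bytes()
        inputs = base64.b64encode(data).decode()
    payload: Dict[str, Any] = {"inputs": inputs}
    if parameters:
        payload["parameters"] = parameters
    return _InferenceInputs(json=payload)
```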

@hanouticelina (Contributor, Author) replied:

hum yes, you're right! Actually I did not update image_to_text, since it has no parameters and no logic for choosing between sending a JSON payload or raw data:

```python
response = self.post(data=image, model=model, task="image-to-text")
output = ImageToTextOutput.parse_obj(response)
return output[0] if isinstance(output, list) else output
```

but of course your point is totally valid.
Having a flag seems to cover all the cases. In the beginning I thought about having an InputType enum and adding an input_type arg to _prepare_payload() (sketched below for comparison), but it's simpler to just use an expect_binary flag. I don't have a better solution either for now 😕
I'll fix the suggestions and get back to this :)
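
For reference, the enum alternative mentioned above might have looked something like this (hypothetical, never implemented in this PR):

```python
from enum import Enum, auto

class InputType(Enum):
    RAW = auto()   # bytes / local path / URL to be fetched and sent as the body
    TEXT = auto()  # plain string embedded in the JSON payload
    DICT = auto()  # structured input, e.g. for question answering

# Call sites would then pass the task's input kind explicitly:
#   _prepare_payload(image, parameters, input_type=InputType.RAW)
# A single expect_binary flag covers the same cases with a smaller API surface.
```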

Resolved review threads: src/huggingface_hub/inference/_client.py (×3)
@hanouticelina (Contributor, Author) commented:

thanks @Wauplin for the review! I addressed your suggestions and ended up adding an expect_binary flag, as it's the simplest way to handle the special case of image and audio inputs (path, URL, and binary) :)
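
To illustrate the resulting call-site pattern (hypothetical examples, reusing the sketched _prepare_payload from above; the real call sites live in _client.py):

```python
from pathlib import Path

image = Path("cat.png")

# Binary task (image-to-text): bytes, a local path, or a URL string are accepted.
prepared = _prepare_payload(image, parameters=None, expect_binary=True)

# Text task (feature-extraction): a URL here is treated as a plain string input.
prepared = _prepare_payload("https://example.com/doc", parameters={"truncate": True})
```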

@hanouticelina hanouticelina requested a review from Wauplin October 14, 2024 10:48
@Wauplin (Contributor) left a comment:

Looking good! I have a question related to question answering, and then we should be good to merge.

Resolved review threads: src/huggingface_hub/inference/_common.py (×3)
@hanouticelina hanouticelina requested a review from Wauplin October 14, 2024 14:16
@Wauplin (Contributor) left a comment:

Been a bit picky on corner cases here, but I do think it's worth it 😇

Resolved review threads: src/huggingface_hub/inference/_common.py (×2), tests/test_inference_client.py
@hanouticelina hanouticelina requested a review from Wauplin October 15, 2024 10:56
@Wauplin (Contributor) left a comment:

Thanks @hanouticelina! That should make the inference client more reliable on corner cases in the future, and reduce the amount of duplicated code 😄

Resolved review threads: src/huggingface_hub/inference/_client.py, src/huggingface_hub/inference/_generated/_async_client.py
@hanouticelina (Contributor, Author) commented:

thanks @Wauplin! I think we're good to merge this one

@hanouticelina hanouticelina merged commit a4bc2e5 into main Oct 15, 2024
19 checks passed
@hanouticelina hanouticelina deleted the factorize-inference-payload branch October 15, 2024 13:33